Access published February 3, 2007
Abstract
Examination of all-against-all comparisons (see below) using this scoring function showed that some families consistently scored higher than others. For example, some families of membrane proteins scored more highly against other membrane proteins, probably because of biases in sequence composition. Therefore, for each pairwise comparison of search outputs, the following normalisation is carried out to give the 'normalised score':

$$S^{norm}_{A,B} = \frac{S^{raw}_{A,B}}{\max(S_A, S_B)}$$

where $S_A$ is calculated as:

$$S_A = \frac{1}{N}\left(100 + \sum_{x} S^{raw}_{A,x}\right)$$

and $N$ is the number of outputs matched. This normalisation reduces the score of any pair of profiles when either or both of them tend to score highly against a large number of other profiles.

To evaluate the SCOOP method we used the search outputs of all the HMMER profile-HMMs in Pfam version 18.0 to find relationships. The search output files were constructed by searching all Pfam profile-HMMs against the Pfam sequence database, which is based on UniProt (Swiss-Prot 47.0 and SP-TrEMBL 30.0) (Wu, et al., 2006). The HMMER software was used in ls mode with a gathering E-value of 1000.

To benchmark the performance of the method we compared it with the following profile-profile comparison tools: PRC, COMPASS and HHsearch. We considered using the COACH software (Edgar, et al., 2004), but it is designed for aligning families rather than scoring their similarity (Robert Edgar, personal communication). For each of these tools we carried out an all-against-all comparison of Pfam release 18.0. For PRC we used version 1.5.2 in local-local mode with Viterbi alignments. For COMPASS we used version 2.332 and for HHsearch version 1.2.0, both with default options.

The SCOOP method was implemented in Perl and requires a flat-file of matches from all search output files, ordered by protein matched. The accuracy of the SCOOP method increases with the number of regions matched, because more matches give better statistics.
Performance degrades as the number of sequences in the underlying sequence database is reduced (data not shown). The normalisation gives a significant performance increase but requires a large set (more than 1000) of profile-HMM outputs. The program was optimised to run with data from a large number of search outputs on an underlying sequence database of more than 1 000 000 sequences. …
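
The normalisation described above can be illustrated with a minimal Python sketch. It is not taken from the SCOOP Perl implementation: the function names, the dictionary of raw scores keyed by family pair, and the family labels are all hypothetical. It also assumes that the per-family term is S_A = (100 + sum of the raw scores of A) / N, with N counted as the number of other outputs that A was scored against.

```python
from collections import defaultdict


def family_background(raw_scores):
    """Per-family term S_A, ASSUMED to be (100 + sum_x S_raw[A, x]) / N.

    raw_scores maps an unordered pair (family_a, family_b) to its raw score.
    N is taken here as the number of pairs a family takes part in.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for (fam_a, fam_b), score in raw_scores.items():
        for fam in (fam_a, fam_b):
            totals[fam] += score
            counts[fam] += 1
    return {fam: (100.0 + totals[fam]) / counts[fam] for fam in totals}


def normalised_scores(raw_scores):
    """Divide each raw pair score by the larger of the two per-family terms."""
    background = family_background(raw_scores)
    return {
        (fam_a, fam_b): score / max(background[fam_a], background[fam_b])
        for (fam_a, fam_b), score in raw_scores.items()
    }


if __name__ == "__main__":
    # Hypothetical raw scores for three made-up families.
    raw = {("FamA", "FamB"): 50.0, ("FamA", "FamC"): 150.0, ("FamB", "FamC"): 10.0}
    for pair, score in sorted(normalised_scores(raw).items()):
        print(pair, round(score, 4))
```

In this toy input FamA has the largest per-family term (150), so the FamA/FamB pair is scaled down from 50 to about 0.33. This is the intended effect: a pair is penalised when either member tends to score highly against many other profiles.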